Rapid Rule Compaction Strategies for Global Knowledge Discovery in a Supervised Learning Classifier System

Authors

  • Jie Tan
  • Jason H. Moore
  • Ryan J. Urbanowicz
Abstract

Michigan-style learning classifier systems have availed themselves as a promising modeling and data mining strategy for bioinformaticists seeking to connect predictive variables with disease phenotypes. The resulting ‘model’ learned by these algorithms is comprised of an entire population of rules, some of which will inevitably be redundant or poor predictors. Rule compaction is a post-processing strategy for consolidating this rule population with the goal of improving interpretation and knowledge discovery. However, existing rule compaction strategies tend to reduce overall rule population performance along with population size, especially in the context of noisy problem domains such as bioinformatics. In the present study we introduce and evaluate two new rule compaction strategies (QRC, PDRC) and a simple rule filtering method (QRF), and compare them to three existing methodologies. These new strategies are tuned to fit with a global approach to knowledge discovery in which less emphasis is placed on minimizing rule population size (to facilitate manual rule inspection) and more is placed on preserving performance. This work identified the strengths and weaknesses of each approach, suggesting PDRC to be the most balanced approach trading a minimal loss in testing accuracy for significant gains or consistency in all other performance statistics.

Similar resources

An Extended Michigan-Style Learning Classifier System for Flexible Supervised Learning, Classification, and Data Mining

Advancements in learning classifier system (LCS) algorithms have highlighted their unique potential for tackling complex, noisy problems, as found in bioinformatics. Ongoing research in this domain must address the challenges of modeling complex patterns of association, systems biology (i.e. the integration of different data types to achieve a more holistic perspective), and ‘big data’ (i.e. sc...

Pareto Inspired Multi-objective Rule Fitness for Noise-Adaptive Rule-Based Machine Learning

Learning classifier systems (LCSs) are rule-based evolutionary algorithms uniquely suited to classification and data mining in complex, multi-factorial, and heterogeneous problems. The fitness of individual LCS rules is commonly based on accuracy, but this metric alone is not ideal for assessing global rule ‘value’ in noisy problem domains and thus impedes effective knowledge extraction. Multi-...

A New Hybrid Architecture for the Discovery and Compaction of Knowledge from Breast Cancer Datasets

This paper reports on a two-fold contribution; first, the introduction of a new compaction algorithm for the rules generated by learning classifier systems that overcomes the disadvantages of previous algorithms in complexity, compacted solution size, accuracy and usability. The second is the new hybrid architecture that integrates learning classifier systems with Rete-based Inference Engines t...

Using Bayesian Classification for Aq-based Learning with Constructive Induction

To obtain potentially interesting patterns and relations from large, distributed, heterogeneous databases, it is essential to employ an intelligent and automated KDD (Knowledge Discovery in Databases) process. One of the most important methodologies is an integration of diverse learning strategies that cooperatively performs a variety of techniques and achieves high quality knowledge. AqBC is a...

AqBC: A Multistrategy Approach for Constructive Induction

In order to obtain potentially interesting patterns and relations from large, distributed, heterogeneous databases, it is essential to employ an intelligent and automated KDD (Knowledge Discovery in Databases) process. One of the most important methodologies is an integration of diverse learning strategies that cooperatively performs a variety of techniques and achieves high quality knowledge. ...

Publication date: 2013